Comparative analysis of CNN filter sizes

Evaluating the Impact of Receptive Field in Encoder-Decoder and U-Net Models for Lane Detection Segmentation

Susanta Deka, Kalyani Kotti (Advisor: Dr. Cohen)

Image Classification vs Semantic Segmentation

  • Image classification: assign a single label to the whole image

  • Example: Cat or Dog

Image Classification vs Semantic Segmentation

  • Semantic segmentation: assign a label to each pixel

  • Output: a mask with the same height and width as the input image
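
As a minimal sketch of how per-pixel labels become a mask, the snippet below picks the highest-scoring class at each pixel. The scores, the 2x3 image size, and the two class names (background, lane) are made-up values for illustration, not output from the models in this deck.

```python
# Hypothetical per-pixel class scores for a 2x3 image and 2 classes
# (0 = background, 1 = lane). scores[c][y][x] is the score of class c.
scores = [
    [[0.9, 0.2, 0.8], [0.7, 0.1, 0.6]],  # class 0: background
    [[0.1, 0.8, 0.2], [0.3, 0.9, 0.4]],  # class 1: lane
]


def scores_to_mask(scores):
    """Pick the highest-scoring class index at every pixel."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


mask = scores_to_mask(scores)
for row in mask:
    print(row)
```

The mask has one class index per pixel, so it keeps the height and width of the input.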

Convolution Layer

  • Typical CNN architecture

Convolution Layer

  • Feature Extraction

Convolution Layer - Animation

  • Matrix Dot Product

Convolution Layer - Math

  • Matrix Dot Product

  • Sum of Element-Wise multiplication

  • Downsampling
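
The sum of element-wise products can be sketched in plain Python as below. This is an illustrative implementation assuming a square kernel and no padding; the 4x4 image and 2x2 kernel are made-up values. With a stride of 2 the 4x4 input shrinks to 2x2, which is the downsampling effect.

```python
def conv2d(image, kernel, stride=1):
    """2D convolution as used in CNNs (no padding, square kernel assumed)."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of element-wise products between the kernel and the patch under it.
            total = sum(
                image[i * stride + a][j * stride + b] * kernel[a][b]
                for a in range(k)
                for b in range(k)
            )
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]

feature_map = conv2d(image, kernel, stride=2)
print(feature_map)  # [[2, 6], [6, 2]]
```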

Transposed Convolution

  • Reverses the spatial downsampling of a convolution (not an exact inverse)

  • Upsample

  • Expand
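
One way to picture the upsampling is that every input value stamps a scaled copy of the kernel onto a larger output, and overlapping stamps are summed. The sketch below assumes a square kernel and no padding; the 2x2 input and all-ones kernel are made-up values.

```python
def conv_transpose2d(image, kernel, stride=2):
    """Transposed convolution (no padding, square kernel assumed)."""
    k = len(kernel)
    in_h, in_w = len(image), len(image[0])
    # Output size grows with the stride: (input - 1) * stride + kernel size.
    out_h = (in_h - 1) * stride + k
    out_w = (in_w - 1) * stride + k
    output = [[0] * out_w for _ in range(out_h)]
    for i in range(in_h):
        for j in range(in_w):
            for a in range(k):
                for b in range(k):
                    # Each input value adds a scaled copy of the kernel to the output.
                    output[i * stride + a][j * stride + b] += image[i][j] * kernel[a][b]
    return output


small = [[1, 2], [3, 4]]
ones_kernel = [[1, 1], [1, 1]]

upsampled = conv_transpose2d(small, ones_kernel, stride=2)
for row in upsampled:
    print(row)
```

With a 2x2 kernel and stride 2 the stamps do not overlap, so the 2x2 input expands to 4x4.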

CNN Encoder-Decoder Architecture

  • Convolution Layers in Encoder

  • Transposed Convolution Layer in Decoder

U-Net

  • Double Convolution Layers

  • Transposed Convolution Layer in Expanding Path

  • Skip Connections (Concat): carry spatial context from the contracting path to the expanding path

Model Parameter Count

Model       Parameters     UNet minus CNN (same filter size)
CNN 3x3       3,139,587
UNet 3x3     31,037,763     27,898,176
CNN 5x5       8,713,219
UNet 5x5     81,241,411     72,528,192
CNN 7x7      17,073,667
UNet 7x7    156,546,883    139,473,216
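
The growth with filter size follows from how a convolution layer's weights are counted: kernel height x kernel width x input channels x output channels, plus one bias per output channel. The sketch below uses hypothetical channel counts (64 in, 128 out) that are not taken from the models above. It only shows that the weight count scales with the square of the kernel size (25/9 ≈ 2.8x and 49/9 ≈ 5.4x), which is roughly how the CNN rows in the table grow.

```python
def conv_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable parameters in one 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


# Hypothetical layer: 64 input channels, 128 output channels.
for k in (3, 5, 7):
    print(f"{k}x{k}: {conv_params(64, 128, k):,}")
```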

Training Setup

  • PyTorch DataLoaders – Batch training

  • Epochs – 10

  • Loss – PyTorch CrossEntropy

PyTorch Cross Entropy

  • Combines log-softmax and Negative Log Likelihood (NLL) internally
  • Raw model outputs (logits): [2.5, 1.2, 0.8]
  • After softmax: ≈ [0.687, 0.187, 0.126]
  • If the correct class is index 0, NLL loss = -log(0.687) ≈ 0.38
  • If the correct class is index 2, NLL loss = -log(0.126) ≈ 2.07
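
The softmax-then-NLL computation can be reproduced with the standard library alone. This is a minimal single-sample sketch of the same arithmetic, not PyTorch's implementation; the helper names `softmax` and `cross_entropy` are chosen here for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(softmax(logits)[target])


logits = [2.5, 1.2, 0.8]
probs = softmax(logits)
print([round(p, 3) for p in probs])
for target in range(len(logits)):
    print(f"target {target}: loss = {cross_entropy(logits, target):.3f}")
```

The loss is small when the correct class already has the highest probability and grows as that probability shrinks.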

Training Time

  • UNet-3


Training Time

  • UNet-7


Training Time

  • CNN-7


Training Time

  • CNN-5

Comparison of Memory Usage

Performance of Models

  • IoU

  • Dice Coefficient
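
Both metrics compare a predicted mask with the ground-truth mask: IoU is intersection over union, and Dice is twice the intersection over the sum of the two mask areas. The sketch below assumes binary masks stored as nested lists of 0/1; the two small masks are made-up values, and treating two empty masks as a perfect match is a convention chosen here.

```python
def iou_and_dice(pred, target):
    """IoU and Dice coefficient for two binary masks (nested lists of 0/1)."""
    pred_flat = [p for row in pred for p in row]
    target_flat = [t for row in target for t in row]
    intersection = sum(p * t for p, t in zip(pred_flat, target_flat))
    pred_area = sum(pred_flat)
    target_area = sum(target_flat)
    union = pred_area + target_area - intersection
    # Assumed convention: two empty masks count as a perfect match.
    iou = intersection / union if union else 1.0
    dice = 2 * intersection / (pred_area + target_area) if (pred_area + target_area) else 1.0
    return iou, dice


pred_mask = [[0, 1, 1], [0, 1, 0]]
true_mask = [[0, 1, 0], [0, 1, 1]]

iou, dice = iou_and_dice(pred_mask, true_mask)
print(f"IoU = {iou:.3f}, Dice = {dice:.3f}")
```

For the same pair of masks Dice is never lower than IoU, which is why the Dice scores reported for a model sit above its IoU scores.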

Top Performers

  • CNN-5

  • UNet-3

Predictions - CNN-5

Predictions - UNet-3

Conclusion

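  • Overall pick: CNN-5 over UNet-3 (accuracy vs. size trade-off)

  • UNet-3 scores slightly higher than CNN-5

  • Dice: .885 (UNet-3) vs .85 (CNN-5); IoU: .805 (UNet-3) vs .755 (CNN-5)

  • CNN-5 is about 72% smaller than UNet-3 by parameter count (8,713,219 vs 31,037,763)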